Goto

Collaborating Authors

 sentence vector


Skip-Thought Vectors

Ryan Kiros, Yukun Zhu, Russ R. Salakhutdinov, Richard Zemel, Raquel Urtasun, Antonio Torralba, Sanja Fidler

Neural Information Processing Systems

We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage. Sentences that share semantic and syntactic properties are thus mapped to similar vector representations. We next introduce a simple vocabulary expansion method to encode words that were not seen as part of training, allowing us to expand our vocabulary to a million words. After training our model, we extract and evaluate our vectors with linear models on 8 tasks: semantic relatedness, paraphrase detection, image-sentence ranking, question-type classification and 4 benchmark sentiment and subjectivity datasets. The end result is an off-the-shelf encoder that can produce highly generic sentence representations that are robust and perform well in practice.


Compositional Concept Generalization with Variational Quantum Circuits

Hawashin, Hala, Abbaszadeh, Mina, Joseph, Nicholas, Pearson, Beth, Lewis, Martha, sadrzadeh, Mehrnoosh

arXiv.org Artificial Intelligence

Personal use of this material is permitted. Abstract--Compositional generalization is a key facet of human cognition, but lacking in current AI tools such as vision-language models. Previous work examined whether a compositional tensor-based sentence semantics can overcome the challenge, but led to negative results. We conjecture that the increased training efficiency of quantum models will improve performance in these tasks. We interpret the representations of compositional tensor-based models in Hilbert spaces and train V ariational Quantum Circuits to learn these representations on an image captioning task requiring compositional generalization. We used two image encoding techniques: a multi-hot encoding (MHE) on binary image vectors and an angle/amplitude encoding on image vectors taken from the vision-language model CLIP . We achieve good proof-of-concept results using noisy MHE encodings. Performance on CLIP image vectors was more mixed, but still outperformed classical compositional models.


Style Extraction on Text Embeddings Using VAE and Parallel Dataset

Kong, InJin, Kang, Shinyee, Park, Yuna, Kim, Sooyong, Park, Sanghyun

arXiv.org Artificial Intelligence

This study investigates the stylistic differences among various Bible translations using a Variational Autoencoder (VAE) model. By embedding textual data into high-dimensional vectors, the study aims to detect and analyze stylistic variations between translations, with a specific focus on distinguishing the American Standard Version (ASV) from other translations. The results demonstrate that each translation exhibits a unique stylistic distribution, which can be effectively identified using the VAE model. These findings suggest that the VAE model is proficient in capturing and differentiating textual styles, although it is primarily optimized for distinguishing a single style. The study highlights the model's potential for broader applications in AI-based text generation and stylistic analysis, while also acknowledging the need for further model refinement to address the complexity of multi-dimensional stylistic relationships. Future research could extend this methodology to other text domains, offering deeper insights into the stylistic features embedded within various types of textual data.


SlideSpawn: An Automatic Slides Generation System for Research Publications

Kumar, Keshav, Chowdary, Ravindranath

arXiv.org Artificial Intelligence

Research papers are well structured documents. They have text, figures, equations, tables etc., to covey their ideas and findings. They are divided into sections like Introduction, Model, Experiments etc., which deal with different aspects of research. Characteristics like these set research papers apart from ordinary documents and allows us to significantly improve their summarization. In this paper, we propose a novel system, SlideSpwan, that takes PDF of a research document as an input and generates a quality presentation providing it's summary in a visual and concise fashion. The system first converts the PDF of the paper to an XML document that has the structural information about various elements. Then a machine learning model, trained on PS5K dataset and Aminer 9.5K Insights dataset (that we introduce), is used to predict salience of each sentence in the paper. Sentences for slides are selected using ILP and clustered based on their similarity with each cluster being given a suitable title. Finally a slide is generated by placing any graphical element referenced in the selected sentences next to them. Experiments on a test set of 650 pairs of papers and slides demonstrate that our system generates presentations with better quality.


Centroid-Based Efficient Minimum Bayes Risk Decoding

Deguchi, Hiroyuki, Sakai, Yusuke, Kamigaito, Hidetaka, Watanabe, Taro, Tanaka, Hideki, Utiyama, Masao

arXiv.org Artificial Intelligence

Minimum Bayes risk (MBR) decoding achieved state-of-the-art translation performance by using COMET, a neural metric that has a high correlation with human evaluation. However, MBR decoding requires quadratic time since it computes the expected score between a translation hypothesis and all reference translations. We propose centroid-based MBR (CBMBR) decoding to improve the speed of MBR decoding. Our method clusters the reference translations in the feature space, and then calculates the score using the centroids of each cluster. The experimental results show that our CBMBR not only improved the decoding speed of the expected score calculation 5.7 times, but also outperformed vanilla MBR decoding in translation quality by up to 0.5 COMET in the WMT'22 En$\leftrightarrow$Ja, En$\leftrightarrow$De, En$\leftrightarrow$Zh, and WMT'23 En$\leftrightarrow$Ja translation tasks.


Extractive text summarisation of Privacy Policy documents using machine learning approaches

Choi, Chanwoo

arXiv.org Artificial Intelligence

This work demonstrates two Privacy Policy (PP) summarisation models based on two different clustering algorithms: K-means clustering and Pre-determined Centroid (PDC) clustering. K-means is decided to be used for the first model after an extensive evaluation of ten commonly used clustering algorithms. The summariser model based on the PDC-clustering algorithm summarises PP documents by segregating individual sentences by Euclidean distance from each sentence to the pre-defined cluster centres. The cluster centres are defined according to General Data Protection Regulation (GDPR)'s 14 essential topics that must be included in any privacy notices. The PDC model outperformed the K-means model for two evaluation methods, Sum of Squared Distance (SSD) and ROUGE by some margin (27% and 24% respectively). This result contrasts the K-means model's better performance in the general clustering of sentence vectors before running the task-specific evaluation. This indicates the effectiveness of operating task-specific fine-tuning measures on unsupervised machine-learning models. The summarisation mechanisms implemented in this paper demonstrates an idea of how to efficiently extract essential sentences that should be included in any PP documents. The summariser models could be further developed to an application that tests the GDPR-compliance (or any data privacy legislation) of PP documents.


A Challenge Dataset and Effective Models for Conversational Stance Detection

Niu, Fuqiang, Yang, Min, Li, Ang, Zhang, Baoquan, Peng, Xiaojiang, Zhang, Bowen

arXiv.org Artificial Intelligence

Previous stance detection studies typically concentrate on evaluating stances within individual instances, thereby exhibiting limitations in effectively modeling multi-party discussions concerning the same specific topic, as naturally transpire in authentic social media interactions. This constraint arises primarily due to the scarcity of datasets that authentically replicate real social media contexts, hindering the research progress of conversational stance detection. In this paper, we introduce a new multi-turn conversation stance detection dataset (called \textbf{MT-CSD}), which encompasses multiple targets for conversational stance detection. To derive stances from this challenging dataset, we propose a global-local attention network (\textbf{GLAN}) to address both long and short-range dependencies inherent in conversational data. Notably, even state-of-the-art stance detection methods, exemplified by GLAN, exhibit an accuracy of only 50.47\%, highlighting the persistent challenges in conversational stance detection. Furthermore, our MT-CSD dataset serves as a valuable resource to catalyze advancements in cross-domain stance detection, where a classifier is adapted from a different yet related target. We believe that MT-CSD will contribute to advancing real-world applications of stance detection research. Our source code, data, and models are available at \url{https://github.com/nfq729/MT-CSD}.


Skip-Thought Vectors Ryan Kiros 1, Richard S. Zemel

Neural Information Processing Systems

We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoderdecoder model that tries to reconstruct the surrounding sentences of an encoded passage. Sentences that share semantic and syntactic properties are thus mapped to similar vector representations. We next introduce a simple vocabulary expansion method to encode words that were not seen as part of training, allowing us to expand our vocabulary to a million words. After training our model, we extract and evaluate our vectors with linear models on 8 tasks: semantic relatedness, paraphrase detection, image-sentence ranking, question-type classification and 4 benchmark sentiment and subjectivity datasets. The end result is an off-the-shelf encoder that can produce highly generic sentence representations that are robust and perform well in practice.


Learning Unsupervised Semantic Document Representation for Fine-grained Aspect-based Sentiment Analysis

Fu, Hao-Ming, Cheng, Pu-Jen

arXiv.org Artificial Intelligence

Document representation is the core of many NLP tasks on machine understanding. A general representation learned in an unsupervised manner reserves generality and can be used for various applications. In practice, sentiment analysis (SA) has been a challenging task that is regarded to be deeply semantic-related and is often used to assess general representations. Existing methods on unsupervised document representation learning can be separated into two families: sequential ones, which explicitly take the ordering of words into consideration, and non-sequential ones, which do not explicitly do so. However, both of them suffer from their own weaknesses. In this paper, we propose a model that overcomes difficulties encountered by both families of methods. Experiments show that our model outperforms state-of-the-art methods on popular SA datasets and a fine-grained aspect-based SA by a large margin.


Topological Interpretations of GPT-3

Sun, Tianyi, Nelson, Bradley

arXiv.org Artificial Intelligence

This is an experiential study of investigating a consistent method for deriving the correlation between sentence vector and semantic meaning of a sentence. We first used three state-of-the-art word/sentence embedding methods including GPT-3, Word2Vec, and Sentence-BERT, to embed plain text sentence strings into high dimensional spaces. Then we compute the pairwise distance between any possible combination of two sentence vectors in an embedding space and map them into a matrix. Based on each distance matrix, we compute the correlation of distances of a sentence vector with respect to the other sentence vectors in an embedding space. Then we compute the correlation of each pair of the distance matrices. We observed correlations of the same sentence in different embedding spaces and correlations of different sentences in the same embedding space. These observations are consistent with our hypothesis and take us to the next stage.